153 research outputs found

    A Probabilistic Spatial Distribution Model for Wire Faults in Parallel Network-on-Chip Links

    Get PDF
    High-performance chip multiprocessors contain numerous parallel-processing cores where a fabric devised as a network-on-chip (NoC) efficiently handles their escalating intertile communication demands. Unfortunately, prolonged operational stresses cause accelerated physically induced wearout leading to permanent metal wire faults in links. Where only a subset of wires may malfunction, enduring healthy wires are leveraged to sustain connectivity when a partially faulty link recovery mechanism is utilized, where its data recovery latency overhead is proportional to the number of consecutive faulty wires. With NoC link failure models being ultimately important, albeit being absent from existing literature, the construction of a mathematical model towards the understanding of the distribution of wire faults in parallel on-chip links is very critical. This paper steps in such a direction, where the objective is to find the probability of having a “fault segment” consisting of a certain number of consecutive “faulty” wires in a parallel NoC link. First, it is shown how the given problem can be reduced to an equivalent combinatorial problem through partitions and necklaces. Then the proposed algorithm counts certain classes of necklaces by making a separation between periodic and aperiodic cases. Finally, the resulting analytical model is tested successfully against a far more costly brute-force algorithm

    Memory hierarchy and data communication in heterogeneous reconfigurable SoCs

    Get PDF
    The miniaturization race in the hardware industry aiming at continuous increasing of transistor density on a die does not bring respective application performance improvements any more. One of the most promising alternatives is to exploit a heterogeneous nature of common applications in hardware. Supported by reconfigurable computation, which has already proved its efficiency in accelerating data intensive applications, this concept promises a breakthrough in contemporary technology development. Memory organization in such heterogeneous reconfigurable architectures becomes very critical. Two primary aspects introduce a sophisticated trade-off. On the one hand, a memory subsystem should provide well organized distributed data structure and guarantee the required data bandwidth. On the other hand, it should hide the heterogeneous hardware structure from the end-user, in order to support feasible high-level programmability of the system. This thesis work explores the heterogeneous reconfigurable hardware architectures and presents possible solutions to cope the problem of memory organization and data structure. By the example of the MORPHEUS heterogeneous platform, the discussion follows the complete design cycle, starting from decision making and justification, until hardware realization. Particular emphasis is made on the methods to support high system performance, meet application requirements, and provide a user-friendly programmer interface. As a result, the research introduces a complete heterogeneous platform enhanced with a hierarchical memory organization, which copes with its task by means of separating computation from communication, providing reconfigurable engines with computation and configuration data, and unification of heterogeneous computational devices using local storage buffers. It is distinguished from the related solutions by distributed data-flow organization, specifically engineered mechanisms to operate with data on local domains, particular communication infrastructure based on Network-on-Chip, and thorough methods to prevent computation and communication stalls. In addition, a novel advanced technique to accelerate memory access was developed and implemented

    A fine-grained link-level fault-tolerant mechanism for networks-on-chip

    No full text
    Silicon technology scaling is continuously enabling denser integration capabilities. However, this comes at the expense of higher variability and susceptibility to wear-out. With an escalating number of on-chip components expected to be defective in near-future chips, modern parallel systems, such as Chip Multi-Processors (CMP), become especially vulnerable to these faults. Just a single link failure in the underlying Network on-Chip (NoC) may cause inter-tile communication to halt and even deadlock, rendering the chip useless. While fault-tolerant routing schemes do exist, they can only handle a finite number of link faults. In this paper, we address permanent wire failures which can occur in on-chip parallel links at manufacture-time or while in operation. Instead of marking the entire link as faulty, we present a methodology where the Partially Faulty Link (PFL) can still be used to transfer data between NoC routers, thus maintaining network connectivity, extending the yield and lifetime of the chip, and allowing for graceful performance degradation. To achieve this, we devise architectural augmentations both to the router and link micro-architectures, along with link fault detection, diagnosis, and re-configuration at the level of wire granularity. Statistical link-level fault models present the usability of PFLs, while relevant load-balancing routing algorithms and low-cost re-transmission mechanisms are also presented and coupled to the proposed architecture. Hardware synthesis demonstrates the feasibility of the proposed extensions to the base NoC architecture. Results obtained from full-system simulations show that high-performance NoCs are realizable in the presence of PFL

    A highly robust distributed fault-tolerant routing algorithm for NoCs with localized rerouting

    No full text
    Denser transistor integration has enabled the fabrication of multi-tile chips, however, at the expense of higher susceptibility to defects and wear-out. Metal wires comprising the links of Networks-on-Chip (NoCs) are especially vulnerable to such defects, which can render some links disconnected. This paper presents a new fault-tolerant routing scheme to sustain on-chip communication. It uses a localized re-routing approach, whereby de-touring around faulty links -- or complex regions of faults -- is done locally at each node in a purely distributed and dynamic manner, while guaranteeing deadlock- and livelock-freedom. Results using synthetic traffic and real applications with full-system simulations prove its efficacy in addressing a large percentage of NoC links being faulty albeit at a gracefully degraded performance mod

    Dynamic fault-tolerant routing algorithm for networks-on-chip based on localised detouring paths

    No full text
    Downscaled complementary metal-oxide semiconductor (CMOS) technology feature sizes have enabled massive transistor integration densities. Multi-core chips with billions of transistors are now a reality. However, this rapid increase in on-chip resources has come at the expense of higher susceptibility to defects and wear-out. The inter-router communication links of networks-on-chips (NoCs) are composed of metal wires that are especially vulnerable to catastrophic physical effects such as those of electro-migration, which can even cause link disconnects. To address this hazard, fault-tolerant (FT) routing algorithms sustain on-chip communication by re-routing messages around faulty links, or regions. This work presents a new FT routing scheme that employs a localised re-routing approach. Packets are de-toured around faulty links/regions based on purely local and distributed decisions, and without any global link state knowledge. The algorithm, which is proven to be deadlock-and livelock-free, also handles dynamically occurring faults. Detailed evaluation with synthetic traffic patterns and real applications within a full-system simulation environment demonstrate the efficacy of the new scheme with up to 12% of NoC links being faulty. Synthesis results also prove the feasibility of the proposed protocol at modest hardware and power consumption overheads of only over 5 and 2.5%, respectively

    A probabilistic spatial distribution model for wire faults in parallel network-on-chip links

    No full text
    High-performance chip multiprocessors contain numerous parallel-processing cores where a fabric devised as a network-on-chip (NoC) efficiently handles their escalating intertile communication demands. Unfortunately, prolonged operational stresses cause accelerated physically induced wearout leading to permanent metal wire faults in links. Where only a subset of wires may malfunction, enduring healthy wires are leveraged to sustain connectivity when a partially faulty link recovery mechanism is utilized, where its data recovery latency overhead is proportional to the number of consecutive faulty wires. With NoC link failure models being ultimately important, albeit being absent from existing literature, the construction of a mathematical model towards the understanding of the distribution of wire faults in parallel on-chip links is very critical. This paper steps in such a direction, where the objective is to find the probability of having a "fault segment" consisting of a certain number of consecutive "faulty" wires in a parallel NoC link. First, it is shown how the given problem can be reduced to an equivalent combinatorial problem through partitions and necklaces. Then the proposed algorithm counts certain classes of necklaces by making a separation between periodic and aperiodic cases. Finally, the resulting analytical model is tested successfully against a far more costly brute-force algorithm

    A combinatorial application of necklaces: modeling individual link failures in parallel network-on-chip interconnect links

    No full text
    The advent of the multicore era [1, 2] has made the execution of more complex software applications more efficient and faster. On-chip communication among the processing cores, in the form of packetized messages, is managed with the use of on-chip networks (NoCs) [3]. Routers handling on-chip communication are point-to-point topologically interconnected using parallel links laid onto the silicon surface comprising a number of individual parallel wires. With the underlying interconnect structure becoming denser, due to improvements in CMOS technology, parallel links become susceptible to wear-out [4], with permanent link failures inhibiting communication completely and indefinitely. It is hence critical to explore their failure patterns in the wires comprising these links and hence build mechanisms which can recover corrupted in-transit data [5, 6]; since no real data from chip manufacturers exist, the derivation of a mathematical model in aiding the understanding of the distribution of individual wire faults in parallel on-chip links becomes mandatory. This paper takes the first steps in such direction. First it is shown how the given problem reduces to an equivalent combinatorial problem through partitions and necklaces. Then a model that counts certain classes of necklaces is derived by making a separation between periodic and aperiodic cases. The model is tested against a brute-force algorithm to prove its exactness. Finally the obtained model is used in finding the probability distribution of the size of the fault segment of wires in a parallel NoC-based multicore chip
    corecore